A study on font-family and font-size recognition applied to Arabic word images at ultra-low resolution
Abstract
In this paper, we propose a new font and size identification method for ultra-low resolution Arabic word images using a stochastic approach. The literature has shown that Arabic text recognition systems struggle with multi-font and multi-size word images. This is due to the variability induced by some font families, in addition to the inherent difficulties of Arabic writing, including cursive representation, overlaps and ligatures. This research work proposes an efficient stochastic approach to tackle the problem of font and size recognition. Our method scans a word image with a fixed-length, overlapping sliding window. Each window is represented by 102 features whose distribution is captured by Gaussian Mixture Models (GMMs). We present three systems: (1) a font recognition system, (2) a size recognition system and (3) a font and size recognition system. We demonstrate the importance of font identification before recognizing the word images with two multi-font Arabic OCRs (cascading and global). The cascading system is about 23% better than the global multi-font system in terms of word recognition rate on the Arabic Printed Text Image (APTI) database, which is freely available to the scientific community. © 2012 Elsevier B.V. All rights reserved.
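The sliding-window and GMM pipeline described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the window width and step, the toy 2-dimensional feature vectors (standing in for the 102 features, which the abstract does not specify), the diagonal covariances, the hand-picked model parameters, and the decision rule (summing per-window log-likelihoods and taking the best-scoring class, a common choice for GMM classifiers that the abstract does not confirm). The function names are hypothetical.

```python
import math

def sliding_windows(width, win, step):
    """Return (start, end) column ranges of fixed-length, overlapping windows."""
    if width < win:
        return [(0, width)]
    return [(s, s + win) for s in range(0, width - win + 1, step)]

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def gmm_log_likelihood(x, gmm):
    """Log p(x | gmm), using log-sum-exp over (weight, mean, var) components."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

def classify(frames, models):
    """Pick the class whose GMM gives the highest total log-likelihood.

    Assumes the window feature vectors are scored independently.
    """
    scores = {label: sum(gmm_log_likelihood(f, gmm) for f in frames)
              for label, gmm in models.items()}
    return max(scores, key=scores.get), scores

# Toy 2-D models standing in for trained GMMs (made-up values, not from the paper).
models = {
    "font_A": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "font_B": [(0.5, [5.0, 5.0], [1.0, 1.0]), (0.5, [6.0, 6.0], [1.0, 1.0])],
}

# One feature vector per window; a real system would compute these from pixels.
frames = [[0.2, -0.1], [0.9, 1.1], [0.4, 0.6]]
label, scores = classify(frames, models)
print(label)

# Window positions for a 20-column word image, window width 8, step 4.
print(sliding_windows(width=20, win=8, step=4))
```

In this sketch a size recognizer or a joint font-and-size recognizer would differ only in the label set: one GMM per size, or one per (font, size) pair.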
Similar resources
Arabic font recognition based on diacritics features
Many methods have been proposed for Arabic font recognition, but none of them has considered the specialty of the Arabic writing system. Most of these methods are either general pattern recognition approaches or application of other methods which have been developed for languages other than Arabic. Therefore, this paper is the first attempt to present an alternative method for Arabic font recog...
Ultra Bessel sequences in direct sums of Hilbert spaces
In this paper, we establish some new results in ultra Bessel sequences and ultra Bessel sequences of subspaces. Also, we investigate ultra Bessel sequences in direct sums of Hilbert spaces. ...
Benchmarking Strategy for Arabic Screen-Rendered Word Recognition
This chapter presents a new benchmarking strategy for Arabic screen-based word recognition. Firstly, we report on the creation of the new APTI (Arabic Printed Text Image) database. This database is a large-scale benchmarking of open-vocabulary, multi-font, multi-size and multi-style word recognition systems in Arabic. Such systems take as input a text image and compute as output a character stri...
INVESTIGATION OF BARRIERS AND REQUIREMENTS AFFECTING E-SHOPPING BEHAVIOR OF CUSTOMERS IN THE BOOK MARKET
Journal title:
- Pattern Recognition Letters
Volume 34
Publication date 2013